Breaking the Black Box

The Power of Chain-of-Thought (CoT) Prompting

Teem KWONG

Land Acknowledgement

I would like to begin by acknowledging that we are on the traditional, ancestral, and unceded territory of the xʷməθkʷəy̓əm (Musqueam), Sḵwx̱wú7mesh (Squamish), and Tsleil-Waututh peoples. I am thankful to have the opportunity to live and learn on this land.

The “Black Box” Problem

Why standard LLMs fail complex tasks

  • Early users treated LLMs like search engines.
  • Models excel at recalling facts but struggle with multi-step reasoning.
  • Without visible reasoning, models return wrong answers with no explanation.

What is Chain-of-Thought (CoT)?

  • A technique where we ask an AI model to explain its reasoning before giving the answer.
  • Instead of jumping to conclusions, the model unfolds its “thought process”.

Example

Question: Roger has 5 balls. He buys 2 cans (3 balls each). Total?

Standard Prompting:
  Output: “7”
  Result: Incorrect

Chain-of-Thought Prompting:
  Step 1: Roger starts with 5 balls.
  Step 2: 2 cans × 3 = 6 more balls.
  Step 3: 5 + 6 = 11.
  Answer: 11 (Correct)

Why? The model uses its own output as “new context” for the next step.
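This feedback loop can be sketched in a few lines of Python. The `next_step` callable below is a hypothetical stand-in for the LLM, scripted here to replay the Roger example; the point is only that each emitted step is appended to the context the “model” sees next.

```python
def solve_with_cot(question, next_step):
    """Run a chain-of-thought loop: each emitted step is appended to the
    context, so the model conditions on its own earlier reasoning.
    `next_step` is a stand-in for an LLM call (hypothetical, not a real API)."""
    context = question
    steps = []
    while True:
        step = next_step(context)
        if step is None:  # the "model" signals it is done
            break
        steps.append(step)
        context += "\n" + step  # the step becomes new context
    return steps, context

# Scripted stand-in replaying the slide's Roger example.
_script = iter([
    "Step 1: Roger starts with 5 balls.",
    "Step 2: 2 cans x 3 balls = 6 more balls.",
    "Step 3: 5 + 6 = 11.",
    None,
])
steps, context = solve_with_cot(
    "Roger has 5 balls. He buys 2 cans (3 balls each). Total?",
    lambda ctx: next(_script),
)
```

After the loop, `context` holds the question plus all three steps, which is exactly what the model would condition on when producing the final answer.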

The LEGO Principle

1. Logic Decomposition

  • Think of building a LEGO castle.
  • Break massive problems into smaller, manageable “blocks”.
  • Build the baseplate \(\rightarrow\) walls \(\rightarrow\) towers.

2. External Memory

  • Write down a reasoning step, like placing a LEGO brick firmly into the baseplate.
  • Once that brick is placed, the model can “see” it.

3. Error Catching

  • If you put a 2x4 brick where a 2x2 brick should be, you notice it immediately; written-out reasoning steps make errors just as easy to spot.

Variations of the CoT Pattern

1. Zero-Shot CoT

The “Magic” Phrase: “Let’s think step by step.” No examples required.

Image Source: Kojima et al. (2022)
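A minimal sketch of zero-shot CoT: wrap the question and append the trigger phrase, with no exemplars. The `Q:`/`A:` prompt layout follows Kojima et al. (2022); no real model call is made here.

```python
COT_TRIGGER = "Let's think step by step."

def zero_shot_cot(question: str) -> str:
    """Build a zero-shot CoT prompt by appending the trigger phrase
    (prompt format as in Kojima et al., 2022)."""
    return f"Q: {question}\nA: {COT_TRIGGER}"

prompt = zero_shot_cot("Roger has 5 balls. He buys 2 cans (3 balls each). Total?")
```

The returned string is what you would send to the model; the model then continues after the trigger with its step-by-step reasoning.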

Variations of the CoT Pattern

2. Few-Shot CoT

Providing exemplars: show the model 2-3 solved problems with the logic worked out, then pose the new question.

Image Source: Kojima et al. (2022)
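A sketch of few-shot CoT prompt construction: each exemplar is a (question, worked solution) pair, and the new question is left open for the model to continue. The exemplar reuses the slide’s Roger problem; the muffin question is a made-up target for illustration.

```python
def few_shot_cot(exemplars, question):
    """Build a few-shot CoT prompt from (question, worked_solution)
    pairs, ending with the new question for the model to answer."""
    blocks = [f"Q: {q}\nA: {sol}" for q, sol in exemplars]
    blocks.append(f"Q: {question}\nA:")  # model continues from here
    return "\n\n".join(blocks)

exemplars = [(
    "Roger has 5 balls. He buys 2 cans (3 balls each). Total?",
    "Roger starts with 5 balls. 2 cans x 3 = 6 more. 5 + 6 = 11. The answer is 11.",
)]
prompt = few_shot_cot(exemplars, "A baker has 4 trays of 6 muffins. How many muffins?")
```

Because the exemplar spells out its reasoning, the model imitates that structure when answering the final question.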

Variations of the CoT Pattern

3. Self-Consistency

The “Majority Vote.” Sample 5 different reasoning paths; if 4 lead to “11” and 1 leads to “7”, choose “11”.

Image Source: Wang et al. (2022)
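The aggregation step of self-consistency is just a majority vote over the final answers extracted from independently sampled reasoning paths. A minimal sketch, assuming the per-path answers have already been extracted as strings:

```python
from collections import Counter

def self_consistent_answer(sampled_answers):
    """Majority vote over final answers from independently sampled
    reasoning paths (the aggregation step of Wang et al., 2022)."""
    counts = Counter(sampled_answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Five sampled paths: four reach "11", one reaches "7".
vote = self_consistent_answer(["11", "11", "7", "11", "11"])
```

In practice the paths come from sampling the same CoT prompt at a nonzero temperature; the vote discards the occasional path that wanders to a wrong answer.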

Advantages & Trade-offs

The Benefits

  • Transparency: essential for high-stakes domains such as medical diagnosis and financial decisions.
  • Precision: writing out each step keeps the model “on track” during long generations.

The Costs

  • Resource Overhead: Higher token count = Higher cost.
  • Latency: “thinking” takes time, so CoT is not ideal for real-time applications.

Conclusion

  • From Answer Machines to Reasoning Engines.
  • Many modern LLMs adopt this technique.

Thank you!

References